-
Notifications
You must be signed in to change notification settings - Fork 41
feat: add TTL-based cleanup strategy for DatabaseTriggers collection #464
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add TTL-based cleanup strategy for DatabaseTriggers collection #464
Conversation
Implements automatic cleanup of completed/failed triggers using MongoDB TTL index to prevent unbounded growth of the DatabaseTriggers collection. Changes: - Added expires_at field to DatabaseTriggers model for TTL tracking - Created MongoDB TTL index on expires_at with expireAfterSeconds=0 - Added TRIGGER_RETENTION_DAYS setting (default: 30 days, configurable via env) - Updated mark_as_triggered() to set expiration time on completed triggers - Updated mark_as_failed() to set expiration time on failed triggers - PENDING/TRIGGERING triggers remain without expiration (never cleaned up) Implementation: - MongoDB automatically deletes documents when expires_at timestamp is reached - TTL runs in background every 60 seconds (MongoDB default) - Retention period configurable via TRIGGER_RETENTION_DAYS environment variable - Only terminal states (TRIGGERED, FAILED) are marked for cleanup Tests: - Added comprehensive unit tests for TTL expiration logic - Tests verify expires_at is set correctly for both TRIGGERED and FAILED states - Tests verify custom retention periods are respected - All 4 new tests passing Also bumped python-sdk version to 0.0.3b2 Resolves exospherehost#433
SafeDep Report SummaryPackage Details
This report is generated by SafeDep Github App |
Summary of ChangesHello @spa-raj, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a crucial maintenance feature to automatically clean up old and completed or failed database triggers. By leveraging MongoDB's native TTL indexing, the system will now prevent the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughSummary by CodeRabbit
WalkthroughAdds a configurable trigger retention setting and timezone-aware Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant CR as trigger_cron
participant CFG as Settings
participant HT as handle_trigger
participant DB as DatabaseTriggers
participant CTRL as trigger_graph
CR->>CFG: read trigger_retention_hours
loop worker
CR->>HT: handle_trigger(cron_time, retention_hours)
HT->>DB: get_due_triggers(cron_time) -> set TRIGGERING
alt due trigger found
HT->>CTRL: call_trigger_graph(namespace, graph, body, request_id)
alt success
HT->>DB: mark_as_triggered(..., expires_at = now_utc + retention_hours)
HT->>DB: create_next_triggers(..., expires_at = next_time + retention_hours)
else error
HT->>DB: mark_as_failed(..., expires_at = now_utc + retention_hours)
HT->>DB: create_next_triggers(..., expires_at = next_time + retention_hours)
end
else no due triggers
HT-->>CR: return
end
end
Note over DB: TTL index `ttl_expires_at` on `expires_at` (expireAfterSeconds=0)\napplies only when trigger_status in (TRIGGERED, FAILED)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (5 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: ASSERTIVE Plan: Pro 📒 Files selected for processing (4)
🧰 Additional context used🧬 Code graph analysis (1)state-manager/tests/unit/tasks/test_trigger_cron.py (3)
🔇 Additional comments (3)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request effectively implements a TTL-based cleanup strategy for the DatabaseTriggers collection, which is a great improvement for preventing unbounded data growth. The changes are well-structured, including the new configuration setting, the database model update with a TTL index, and the logic to set the expiration on terminal-state triggers. The accompanying unit tests are comprehensive and cover the new functionality well.
My review includes a critical performance consideration regarding the repeated calls to get_settings() within functions that may be executed in a loop. I've also provided a suggestion to improve the maintainability of the new test code by reducing duplication. Overall, this is a solid contribution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (5)
python-sdk/exospherehost/_version.py(1 hunks)state-manager/app/config/settings.py(2 hunks)state-manager/app/models/db/trigger.py(2 hunks)state-manager/app/tasks/trigger_cron.py(3 hunks)state-manager/tests/unit/tasks/test_trigger_ttl.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
state-manager/app/models/db/trigger.py (2)
state-manager/app/config/settings.py (1)
Settings(7-27)state-manager/app/models/db/graph_template_model.py (1)
Settings(34-42)
state-manager/tests/unit/tasks/test_trigger_ttl.py (3)
state-manager/app/tasks/trigger_cron.py (3)
trigger_cron(97-100)mark_as_triggered(74-84)mark_as_failed(37-47)state-manager/app/models/db/trigger.py (1)
DatabaseTriggers(9-44)state-manager/app/models/trigger_models.py (1)
TriggerStatusEnum(9-14)
state-manager/app/tasks/trigger_cron.py (3)
state-manager/app/config/settings.py (1)
get_settings(34-38)state-manager/app/models/db/trigger.py (1)
DatabaseTriggers(9-44)state-manager/app/models/trigger_models.py (1)
TriggerStatusEnum(9-14)
Performance optimization: - Fetch settings once at trigger_cron() level instead of per-trigger - Pass retention_days as parameter to mark functions - Prevents repeated environment variable reads in high-volume scenarios - Eliminates performance degradation when processing triggers in loops Timezone fix (critical): - Use timezone-aware datetime.now(timezone.utc) instead of naive datetime.now() - Prevents shifted expirations on non-UTC hosts - PyMongo treats naive datetimes as UTC, causing incorrect TTL windows - All expires_at timestamps now explicitly UTC for consistency CANCELLED trigger cleanup: - Added mark_as_cancelled() function for future cancellation logic - CANCELLED triggers now expire under same retention policy as TRIGGERED/FAILED - Ensures all terminal states (TRIGGERED, FAILED, CANCELLED) are cleaned up Test improvements: - Refactored using pytest.mark.parametrize to reduce duplication - Added timezone-awareness assertions to verify UTC timestamps - Combined similar tests for all 3 terminal states - Reduced test code from ~120 lines to ~85 lines while increasing coverage - 6 parameterized tests passing (3 states × 2 test scenarios) Signed-off-by: Sparsh <sparsh.raj30@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
state-manager/app/tasks/trigger_cron.py(3 hunks)state-manager/tests/unit/tasks/test_trigger_ttl.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
state-manager/app/tasks/trigger_cron.py (3)
state-manager/app/models/db/trigger.py (1)
DatabaseTriggers(9-44)state-manager/app/models/trigger_models.py (1)
TriggerStatusEnum(9-14)state-manager/app/config/settings.py (1)
get_settings(34-38)
state-manager/tests/unit/tasks/test_trigger_ttl.py (3)
state-manager/app/tasks/trigger_cron.py (4)
trigger_cron(106-110)mark_as_triggered(84-93)mark_as_failed(37-46)mark_as_cancelled(48-57)state-manager/app/models/db/trigger.py (1)
DatabaseTriggers(9-44)state-manager/app/models/trigger_models.py (1)
TriggerStatusEnum(9-14)
🔇 Additional comments (6)
state-manager/tests/unit/tasks/test_trigger_ttl.py (2)
1-12: LGTM! Imports and docstring are clear.The imports correctly include
timezonefor timezone-aware datetime assertions, and the module docstring clearly describes the test purpose.
14-54: LGTM! Well-structured parametrized test.The parametrized approach successfully eliminates test duplication (addressing previous review feedback). The test correctly verifies:
- The update filter targets the correct trigger ID
- Both
trigger_statusandexpires_atare set- The
expires_atvalue is approximately 30 days from now- Timezone-aware UTC datetimes are used (lines 52-53)
The 2-second tolerance is appropriate for test timing variability.
state-manager/app/tasks/trigger_cron.py (4)
1-1: LGTM! Timezone import added.The
timezoneimport enables timezone-aware datetime creation, addressing the previous review concern about naive datetime usage.
37-46: LGTM! TTL logic correctly implemented.The function now:
- Accepts
retention_daysas a parameter (avoiding repeatedget_settings()calls, addressing previous performance concern)- Uses
datetime.now(timezone.utc)for timezone-aware timestamps (ensuring correct TTL behavior across timezones)- Sets both
trigger_statusandexpires_atin a single update
84-93: LGTM! TTL logic correctly implemented.Consistent with
mark_as_failed, this function now acceptsretention_daysand uses timezone-aware UTC timestamps for correct TTL behavior.
95-110: LGTM! Efficient settings propagation.The refactored design addresses the previous performance concern by calling
get_settings()once at the top level (line 108) and threadingretention_daysthrough the call chain. This avoids repeated environment variable reads in the loop.
- Add comprehensive docstring to mark_as_cancelled() explaining it's reserved for future cancellation feature implementation - Rename test_trigger_ttl.py to test_trigger_cron.py to match module name - Expand parametrized tests: test all 3 mark functions with 3 retention periods (9 combinations) for consistent coverage - Add 9 new tests covering all trigger_cron.py functions: - get_due_triggers (with/without triggers) - call_trigger_graph - create_next_triggers (success, DuplicateKeyError, other exceptions) - handle_trigger (success and failure paths) - trigger_cron orchestration - Total: 18 comprehensive tests with timezone-aware datetime assertions Signed-off-by: Sparsh <sparsh.raj30@gmail.com>
193bf32 to
5a9893a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
state-manager/app/config/settings.py(2 hunks)state-manager/app/models/db/trigger.py(2 hunks)state-manager/app/tasks/trigger_cron.py(3 hunks)state-manager/tests/unit/tasks/test_trigger_cron.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
state-manager/tests/unit/tasks/test_trigger_cron.py (3)
state-manager/app/tasks/trigger_cron.py (7)
trigger_cron(95-99)mark_as_triggered(73-82)mark_as_failed(37-46)get_due_triggers(16-27)call_trigger_graph(29-35)create_next_triggers(48-71)handle_trigger(84-93)state-manager/app/models/db/trigger.py (1)
DatabaseTriggers(9-53)state-manager/app/models/trigger_models.py (1)
TriggerStatusEnum(9-14)
state-manager/app/tasks/trigger_cron.py (4)
python-sdk/exospherehost/statemanager.py (1)
trigger(37-91)state-manager/app/models/db/trigger.py (1)
DatabaseTriggers(9-53)state-manager/app/models/trigger_models.py (1)
TriggerStatusEnum(9-14)state-manager/app/config/settings.py (1)
get_settings(34-38)
state-manager/app/models/db/trigger.py (2)
state-manager/app/config/settings.py (1)
Settings(7-27)state-manager/app/models/trigger_models.py (1)
TriggerStatusEnum(9-14)
Signed-off-by: Sparsh <sparsh.raj30@gmail.com>
- Changed the partial filter expression in DatabaseTriggers to exclude PENDING and TRIGGERING statuses. - Introduced an expires_at field in create_crons to set expiration time for triggers based on retention settings, ensuring proper cleanup of triggers after their designated retention period. This enhances the management of trigger states and ensures that only relevant triggers are retained in the database.
- Added "nin" to the list of ignored words in .codespellignorewords. - Ensured consistency by maintaining the existing format of the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
.github/.codespellignorewords(1 hunks)state-manager/app/models/db/trigger.py(2 hunks)state-manager/app/tasks/verify_graph.py(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
state-manager/app/tasks/verify_graph.py (1)
state-manager/app/config/settings.py (1)
get_settings(34-38)
state-manager/app/models/db/trigger.py (2)
state-manager/app/config/settings.py (1)
Settings(7-27)state-manager/app/models/trigger_models.py (1)
TriggerStatusEnum(9-14)
🔇 Additional comments (5)
.github/.codespellignorewords (1)
17-18: LGTM!Adding "nin" to the ignore list is appropriate since it's a MongoDB operator (
$nin) used in the TTL index partial filter expression.state-manager/app/models/db/trigger.py (1)
16-16: LGTM!The
expires_atfield definition is correct. The optional type with defaultNoneis appropriate for triggers that shouldn't expire (PENDING, TRIGGERING).state-manager/app/tasks/verify_graph.py (3)
13-14: LGTM! Imports are correctly added.The new imports support the TTL-based retention functionality for cron triggers.
18-18: Verify that module-level settings initialization is intentional.The settings are loaded once at module import time. If the
TRIGGER_RETENTION_HOURSenvironment variable changes after the module is imported, the change won't be reflected without restarting the application.If runtime reconfiguration is desired, consider calling
get_settings()withincreate_crons()instead of at module level.
119-129: Implementation looks correct, pending resolution of retention period issue.The DatabaseTriggers instantiation properly includes the new
expires_atfield alongside existing fields. The logic correctly propagates the calculated expiration time to the database trigger record.Assuming the retention period calculation at line 117 is corrected per the previous comment, this implementation aligns with the PR objectives.
- Changed the default value of trigger_retention_hours from 24 to 720 in settings.py to extend the retention period for triggers, allowing for better management of trigger states.
…r setup - Updated the state manager setup documentation to include new environment variables: TRIGGER_WORKERS (default: 1) and TRIGGER_RETENTION_HOURS (default: 720) for better configuration of trigger management.
- Introduced a new module for initialization tasks that run when the server starts. - Implemented a task to delete old triggers from the DatabaseTriggers collection based on their status. - Updated the main application file to call the new initialization tasks during the lifespan of the FastAPI app, ensuring proper cleanup of outdated triggers at startup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
state-manager/app/config/settings.py (1)
41-42: Remove or replace the module-level settings snapshot with runtime calls to get_settings().The module-level
settings = get_settings()at lines 41-42 creates a stale snapshot at import time, preventing environment variable overrides from tests or process managers. The codebase already demonstrates the better pattern in most locations (e.g.,app/main.py:57,app/utils/check_secret.py:12,app/tasks/trigger_cron.py:97), whereget_settings()is called at runtime when needed. Additionally, the same problematic pattern exists inapp/utils/encrypter.py:7andapp/tasks/verify_graph.py:18, which also lock settings at module import time.-# Initialize settings -settings = get_settings() +# Avoid module-level snapshots; call get_settings() where needed for dynamic env overrides
♻️ Duplicate comments (1)
state-manager/app/config/settings.py (1)
16-16: Make retention default consistent with docs and add validation.Unify default to 720 and enforce non-negative hours to prevent immediate TTL deletion when misconfigured.
- trigger_retention_hours: int = Field(default=24, description="Number of hours to retain completed/failed triggers before cleanup") + trigger_retention_hours: int = Field( + default=720, # 30 days + ge=1, + description="Number of hours to retain completed/failed triggers before cleanup", + )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
docs/docs/exosphere/state-manager-setup.md(5 hunks)state-manager/app/config/settings.py(2 hunks)state-manager/app/main.py(2 hunks)state-manager/app/tasks/init_tasks.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
state-manager/app/tasks/init_tasks.py (2)
state-manager/app/models/db/trigger.py (1)
DatabaseTriggers(9-52)state-manager/app/models/trigger_models.py (1)
TriggerStatusEnum(9-14)
state-manager/app/main.py (1)
state-manager/app/tasks/init_tasks.py (1)
init_tasks(16-20)
🪛 GitHub Actions: State Manager Unit Tests
state-manager/app/main.py
[error] 66-66: Beanie error: CollectionWasNotInitialized raised during lifespan startup while accessing DatabaseTriggers. Ensure Beanie is initialized before database collection usage. Command: uv run pytest tests/ --cov=app --cov-report=xml --cov-report=term-missing --cov-report=html -v --junitxml=full-pytest-report.xml
[error] 66-66: Beanie error: CollectionWasNotInitialized raised during lifespan startup while accessing DatabaseTriggers. Ensure Beanie is initialized before database collection usage. Command: uv run pytest tests/ --cov=app --cov-report=xml --cov-report=term-missing --cov-report=html -v --junitxml=full-pytest-report.xml
[error] 66-66: Beanie error: CollectionWasNotInitialized raised during lifespan startup while accessing DatabaseTriggers. Ensure Beanie is initialized before database collection usage. Command: uv run pytest tests/ --cov=app --cov-report=xml --cov-report=term-missing --cov-report=html -v --junitxml=full-pytest-report.xml
🔇 Additional comments (1)
state-manager/app/tasks/init_tasks.py (1)
6-14: No changes required. Review comment is factually incorrect.The review's claim that
get_pymongo_collection()is "not part of Beanie's public API in v2" contradicts Beanie's actual API design.get_pymongo_collection()is the official Beanie v2 method (renamed fromget_motor_collection()in v2) to access a model's underlying PyMongo Collection. The current code correctly uses this method. Additionally,get_collection()suggested in the review is a PyMongo Database method, not a Beanie Document method—it does not exist on the DatabaseTriggers class. NoCollectionWasNotInitializederrors or CI failures related to this code were found in the codebase.Likely an incorrect or invalid review comment.
- Updated create_next_triggers function to accept retention_hours as a parameter. - Introduced expires_at calculation to set expiration time for triggers based on retention settings. - Ensured proper handling of trigger creation with expiration in the database, improving trigger management.
- Changed the default value of MONGO_DATABASE_NAME from 'exosphere' to 'exosphere-state-manager' in the state manager setup documentation for clarity and accuracy in configuration.
- Changed the default value of trigger_retention_hours in settings.py from 24 to 720 to extend the retention period for triggers. - Updated test cases in test_main.py and test_trigger_cron.py to reflect the new retention_hours parameter in create_next_triggers function, ensuring consistency in trigger management across the application.
Implements automatic cleanup of completed/failed triggers using MongoDB TTL index to prevent unbounded growth of the DatabaseTriggers collection.
Changes:
Implementation:
Tests:
Also bumped python-sdk version to 0.0.3b2
Resolves #433